An Empirical Evaluation of Evaluation Metrics of Procedurally Generated Mario Levels
Abstract
There are several approaches in the literature for automatically generating Infinite Mario Bros levels. The evaluation of such approaches is often performed solely with computational metrics such as leniency and linearity. While these metrics are important for an initial exploratory evaluation of the generated content, it is not clear whether they are able to capture the player’s perception of that content. In this paper, we evaluate several of the commonly used computational metrics. Specifically, we perform a systematic user study with procedural content generation systems and compare the insights gained from our user study with those gained from analyzing the computational metric values. The results of our experiment suggest that current computational metrics should not be used in lieu of user studies for evaluating content generated by computer programs.
Similar resources
Using the Taxonomy and the Metrics: What to Study When and Why; Comment on “Metrics and Evaluation Tools for Patient Engagement in Healthcare Organization- and System-Level Decision-Making: A Systematic Review”
Dukhanin and colleagues’ taxonomy of metrics for patient engagement at the organizational and system levels has great potential for supporting more careful and useful evaluations of this ever-growing phenomenon. This commentary highlights the central importance to the taxonomy of metrics assessing the extent of meaningful participation in decision-making by patients, consumers and community mem...
Generating Maps Using Markov Chains
In this paper we outline a method of procedurally generating maps using Markov Chains. Our method attempts to learn what makes a “good” map from a set of given human-authored maps, and then uses those learned patterns to generate new maps. We present an empirical evaluation using the game Super Mario Bros., showing encouraging results.
Review of ranked-based and unranked-based metrics for determining the effectiveness of search engines
Purpose: Traditionally, there have been many metrics for evaluating search engines; nevertheless, various researchers have proposed new metrics in recent years. Awareness of these new metrics is essential for conducting research in the field of search engine evaluation. The purpose of this study was therefore to provide an analysis of important and new metrics for evaluating search engines. Methodology: This is ...
Metrics and Evaluation Tools for Patient Engagement in Healthcare Organization- and System-Level Decision-Making: A Systematic Review
Background: Patient, public, consumer, and community (P2C2) engagement in organization-, community-, and system-level healthcare decision-making is increasing globally, but its formal evaluation remains challenging. To define a taxonomy of possible P2C2 engagement metrics and compare existing evaluation tools against this taxonomy, we conducted a systematic review. Methods: A broad search strate...
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...
Publication date: 2015